Dataset statistics
| Number of variables | 13 |
|---|---|
| Number of observations | 740 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 75.3 KiB |
| Average record size in memory | 104.2 B |
Variable types
| Numeric | 6 |
|---|---|
| Categorical | 7 |
Alerts
| Alert | Type |
|---|---|
| df_index is highly correlated with age and 4 other fields | High correlation |
| age is highly correlated with df_index | High correlation |
| cp is highly correlated with exang | High correlation |
| chol is highly correlated with df_index and 1 other field | High correlation |
| restecg is highly correlated with df_index | High correlation |
| thalach is highly correlated with exang | High correlation |
| exang is highly correlated with cp and 2 other fields | High correlation |
| oldpeak is highly correlated with exang and 1 other field | High correlation |
| num is highly correlated with df_index and 1 other field | High correlation |
| dataset is highly correlated with df_index and 1 other field | High correlation |
| df_index has unique values | Unique |
| chol has 79 (10.7%) zeros | Zeros |
| oldpeak has 329 (44.5%) zeros | Zeros |
Reproduction
| Analysis started | 2022-10-17 20:05:27.779083 |
|---|---|
| Analysis finished | 2022-10-17 20:05:30.803593 |
| Duration | 3.02 seconds |
| Software version | pandas-profiling v3.3.0 |
| Download configuration | config.json |
df_index
Real number (ℝ≥0)
| Distinct | 740 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 419.1108108 |
| Minimum | 0 |
|---|---|
| Maximum | 919 |
| Zeros | 1 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 40.95 |
| Q1 | 211.75 |
| median | 402.5 |
| Q3 | 587.25 |
| 95-th percentile | 860.15 |
| Maximum | 919 |
| Range | 919 |
| Interquartile range (IQR) | 375.5 |
Descriptive statistics
| Standard deviation | 253.0874974 |
|---|---|
| Coefficient of variation (CV) | 0.6038677383 |
| Kurtosis | -0.9496211708 |
| Mean | 419.1108108 |
| Median Absolute Deviation (MAD) | 188 |
| Skewness | 0.2394276424 |
| Sum | 310142 |
| Variance | 64053.28134 |
| Monotonicity | Strictly increasing |
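Several of the figures above are derived from one another, so they can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the variable names are assumptions, and the inputs are copied from the tables above (the sum, standard deviation, quartiles, minimum and maximum, plus the 740 observations from the dataset statistics).

```python
# Values copied from the summary tables above; names are illustrative.
n_observations = 740
total = 310142            # Sum
std_dev = 253.0874974     # Standard deviation
q1, q3 = 211.75, 587.25   # Q1 and Q3
minimum, maximum = 0, 919

mean = total / n_observations        # Mean = Sum / number of observations
cv = std_dev / mean                  # Coefficient of variation = std / mean
variance = std_dev ** 2              # Variance = std squared
iqr = q3 - q1                        # Interquartile range = Q3 - Q1
value_range = maximum - minimum      # Range = Maximum - Minimum

print(f"mean={mean:.7f} cv={cv:.10f} variance={variance:.5f}")
print(f"iqr={iqr} range={value_range}")
```

The same relationships hold for the other numeric variables in this report, which makes them a quick sanity check when a table looks suspicious.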
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 1 | 0.1% |
| 530 | 1 | 0.1% |
| 521 | 1 | 0.1% |
| 522 | 1 | 0.1% |
| 523 | 1 | 0.1% |
| 524 | 1 | 0.1% |
| 525 | 1 | 0.1% |
| 526 | 1 | 0.1% |
| 527 | 1 | 0.1% |
| 528 | 1 | 0.1% |
| Other values (730) | 730 |
| Value | Count | Frequency (%) |
| 0 | 1 | |
| 1 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 5 | 1 | |
| 6 | 1 | |
| 7 | 1 | |
| 8 | 1 | |
| 9 | 1 | |
| 10 | 1 |
| Value | Count | Frequency (%) |
| 919 | 1 | |
| 917 | 1 | |
| 915 | 1 | |
| 914 | 1 | |
| 913 | 1 | |
| 912 | 1 | |
| 911 | 1 | |
| 910 | 1 | |
| 909 | 1 | |
| 908 | 1 |
age
Real number (ℝ≥0)
| Distinct | 50 |
|---|---|
| Distinct (%) | 6.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 53.0972973 |
| Minimum | 28 |
|---|---|
| Maximum | 77 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.9 KiB |
Quantile statistics
| Minimum | 28 |
|---|---|
| 5-th percentile | 37 |
| Q1 | 46 |
| median | 54 |
| Q3 | 60 |
| 95-th percentile | 68 |
| Maximum | 77 |
| Range | 49 |
| Interquartile range (IQR) | 14 |
Descriptive statistics
| Standard deviation | 9.408126694 |
|---|---|
| Coefficient of variation (CV) | 0.1771865457 |
| Kurtosis | -0.4197528623 |
| Mean | 53.0972973 |
| Median Absolute Deviation (MAD) | 7 |
| Skewness | -0.161467334 |
| Sum | 39292 |
| Variance | 88.5128479 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 54 | 43 | 5.8% |
| 58 | 38 | 5.1% |
| 55 | 33 | 4.5% |
| 57 | 31 | 4.2% |
| 56 | 30 | 4.1% |
| 52 | 30 | 4.1% |
| 59 | 28 | 3.8% |
| 51 | 25 | 3.4% |
| 53 | 24 | 3.2% |
| 60 | 24 | 3.2% |
| Other values (40) | 434 |
| Value | Count | Frequency (%) |
| 28 | 1 | 0.1% |
| 29 | 2 | 0.3% |
| 30 | 1 | 0.1% |
| 31 | 2 | 0.3% |
| 32 | 4 | 0.5% |
| 33 | 2 | 0.3% |
| 34 | 6 | |
| 35 | 9 | |
| 36 | 5 | |
| 37 | 11 |
| Value | Count | Frequency (%) |
| 77 | 2 | 0.3% |
| 76 | 1 | 0.1% |
| 75 | 3 | 0.4% |
| 74 | 4 | |
| 73 | 1 | 0.1% |
| 72 | 1 | 0.1% |
| 71 | 4 | |
| 70 | 7 | |
| 69 | 7 | |
| 68 | 8 |
sex
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.9 KiB |
| 1.0 | |
|---|---|
| 0.0 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 2220 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
| 1.0 | 566 | 76.5% |
| 0.0 | 174 | 23.5% |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
| 1.0 | 566 | 76.5% |
| 0.0 | 174 | 23.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 914 | |
| . | 740 | |
| 1 | 566 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1480 | |
| Other Punctuation | 740 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 914 | |
| 1 | 566 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 740 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2220 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 914 | |
| . | 740 | |
| 1 | 566 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2220 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 914 | |
| . | 740 | |
| 1 | 566 |
cp
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.9 KiB |
| 4.0 | |
|---|---|
| 3.0 | |
| 2.0 | |
| 1.0 | 37 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 2220 |
|---|---|
| Distinct characters | 6 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2.0 |
|---|---|
| 2nd row | 2.0 |
| 3rd row | 1.0 |
| 4th row | 2.0 |
| 5th row | 2.0 |
Common Values
| Value | Count | Frequency (%) |
| 4.0 | 392 | 53.0% |
| 3.0 | 161 | 21.8% |
| 2.0 | 150 | 20.3% |
| 1.0 | 37 | 5.0% |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
| 4.0 | 392 | 53.0% |
| 3.0 | 161 | 21.8% |
| 2.0 | 150 | 20.3% |
| 1.0 | 37 | 5.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| . | 740 | |
| 0 | 740 | |
| 4 | 392 | |
| 3 | 161 | 7.3% |
| 2 | 150 | 6.8% |
| 1 | 37 | 1.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1480 | |
| Other Punctuation | 740 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 740 | |
| 4 | 392 | |
| 3 | 161 | 10.9% |
| 2 | 150 | 10.1% |
| 1 | 37 | 2.5% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 740 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2220 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| . | 740 | |
| 0 | 740 | |
| 4 | 392 | |
| 3 | 161 | 7.3% |
| 2 | 150 | 6.8% |
| 1 | 37 | 1.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2220 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| . | 740 | |
| 0 | 740 | |
| 4 | 392 | |
| 3 | 161 | 7.3% |
| 2 | 150 | 6.8% |
| 1 | 37 | 1.7% |
trestbps
Real number (ℝ≥0)
| Distinct | 58 |
|---|---|
| Distinct (%) | 7.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 132.7540541 |
| Minimum | 0 |
|---|---|
| Maximum | 200 |
| Zeros | 1 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 110 |
| Q1 | 120 |
| median | 130 |
| Q3 | 140 |
| 95-th percentile | 160.2 |
| Maximum | 200 |
| Range | 200 |
| Interquartile range (IQR) | 20 |
Descriptive statistics
| Standard deviation | 18.58124966 |
|---|---|
| Coefficient of variation (CV) | 0.1399674743 |
| Kurtosis | 3.80309551 |
| Mean | 132.7540541 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 0.1784665755 |
| Sum | 98238 |
| Variance | 345.2628388 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 120 | 119 | |
| 130 | 102 | |
| 140 | 86 | 11.6% |
| 150 | 50 | 6.8% |
| 110 | 45 | 6.1% |
| 160 | 43 | 5.8% |
| 125 | 23 | 3.1% |
| 128 | 17 | 2.3% |
| 135 | 15 | 2.0% |
| 138 | 15 | 2.0% |
| Other values (48) | 225 |
| Value | Count | Frequency (%) |
| 0 | 1 | 0.1% |
| 92 | 1 | 0.1% |
| 94 | 2 | 0.3% |
| 96 | 1 | 0.1% |
| 98 | 1 | 0.1% |
| 100 | 10 | |
| 101 | 1 | 0.1% |
| 102 | 2 | 0.3% |
| 104 | 2 | 0.3% |
| 105 | 5 |
| Value | Count | Frequency (%) |
| 200 | 3 | 0.4% |
| 192 | 1 | 0.1% |
| 190 | 2 | 0.3% |
| 185 | 1 | 0.1% |
| 180 | 10 | |
| 178 | 3 | 0.4% |
| 174 | 1 | 0.1% |
| 172 | 2 | 0.3% |
| 170 | 12 | |
| 165 | 1 | 0.1% |
chol
Real number (ℝ≥0)
| Distinct | 208 |
|---|---|
| Distinct (%) | 28.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 220.1364865 |
| Minimum | 0 |
|---|---|
| Maximum | 603 |
| Zeros | 79 |
| Zeros (%) | 10.7% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.9 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 197 |
| median | 231 |
| Q3 | 271 |
| 95-th percentile | 336.05 |
| Maximum | 603 |
| Range | 603 |
| Interquartile range (IQR) | 74 |
Descriptive statistics
| Standard deviation | 93.61455549 |
|---|---|
| Coefficient of variation (CV) | 0.4252568803 |
| Kurtosis | 1.81688245 |
| Mean | 220.1364865 |
| Median Absolute Deviation (MAD) | 37 |
| Skewness | -0.8091404209 |
| Sum | 162901 |
| Variance | 8763.685 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 0 | 79 | 10.7% |
| 220 | 10 | 1.4% |
| 254 | 10 | 1.4% |
| 230 | 9 | 1.2% |
| 223 | 9 | 1.2% |
| 260 | 8 | 1.1% |
| 211 | 8 | 1.1% |
| 246 | 8 | 1.1% |
| 216 | 8 | 1.1% |
| 219 | 8 | 1.1% |
| Other values (198) | 583 |
| Value | Count | Frequency (%) |
| 0 | 79 | |
| 85 | 1 | 0.1% |
| 100 | 2 | 0.3% |
| 117 | 1 | 0.1% |
| 126 | 1 | 0.1% |
| 129 | 1 | 0.1% |
| 131 | 1 | 0.1% |
| 132 | 1 | 0.1% |
| 141 | 1 | 0.1% |
| 147 | 2 | 0.3% |
| Value | Count | Frequency (%) |
| 603 | 1 | |
| 564 | 1 | |
| 529 | 1 | |
| 518 | 1 | |
| 491 | 1 | |
| 458 | 1 | |
| 417 | 1 | |
| 412 | 1 | |
| 409 | 1 | |
| 407 | 1 |
fbs
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.9 KiB |
| 0.0 | |
|---|---|
| 1.0 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 2220 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
| 0.0 | 629 | 85.0% |
| 1.0 | 111 | 15.0% |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
| 0.0 | 629 | 85.0% |
| 1.0 | 111 | 15.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1369 | |
| . | 740 | |
| 1 | 111 | 5.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1480 | |
| Other Punctuation | 740 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1369 | |
| 1 | 111 | 7.5% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 740 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2220 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 1369 | |
| . | 740 | |
| 1 | 111 | 5.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2220 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 1369 | |
| . | 740 | |
| 1 | 111 | 5.0% |
restecg
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.9 KiB |
| 0.0 | |
|---|---|
| 2.0 | |
| 1.0 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 2220 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
| 0.0 | 445 | 60.1% |
| 2.0 | 175 | 23.6% |
| 1.0 | 120 | 16.2% |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
| 0.0 | 445 | 60.1% |
| 2.0 | 175 | 23.6% |
| 1.0 | 120 | 16.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1185 | |
| . | 740 | |
| 2 | 175 | 7.9% |
| 1 | 120 | 5.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1480 | |
| Other Punctuation | 740 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1185 | |
| 2 | 175 | 11.8% |
| 1 | 120 | 8.1% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 740 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2220 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 1185 | |
| . | 740 | |
| 2 | 175 | 7.9% |
| 1 | 120 | 5.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2220 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 1185 | |
| . | 740 | |
| 2 | 175 | 7.9% |
| 1 | 120 | 5.4% |
thalach
Real number (ℝ≥0)
| Distinct | 115 |
|---|---|
| Distinct (%) | 15.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 138.7445946 |
| Minimum | 60 |
|---|---|
| Maximum | 202 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 5.9 KiB |
Quantile statistics
| Minimum | 60 |
|---|---|
| 5-th percentile | 96 |
| Q1 | 120 |
| median | 140 |
| Q3 | 159.25 |
| 95-th percentile | 179 |
| Maximum | 202 |
| Range | 142 |
| Interquartile range (IQR) | 39.25 |
Descriptive statistics
| Standard deviation | 25.84608152 |
|---|---|
| Coefficient of variation (CV) | 0.1862853223 |
| Kurtosis | -0.5418561084 |
| Mean | 138.7445946 |
| Median Absolute Deviation (MAD) | 20 |
| Skewness | -0.2298360761 |
| Sum | 102671 |
| Variance | 668.0199301 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) |
| 150 | 38 | 5.1% |
| 140 | 37 | 5.0% |
| 120 | 26 | 3.5% |
| 130 | 26 | 3.5% |
| 160 | 22 | 3.0% |
| 170 | 18 | 2.4% |
| 125 | 18 | 2.4% |
| 110 | 15 | 2.0% |
| 115 | 14 | 1.9% |
| 142 | 14 | 1.9% |
| Other values (105) | 512 |
| Value | Count | Frequency (%) |
| 60 | 1 | 0.1% |
| 63 | 1 | 0.1% |
| 69 | 1 | 0.1% |
| 71 | 1 | 0.1% |
| 72 | 1 | 0.1% |
| 73 | 1 | 0.1% |
| 80 | 2 | |
| 82 | 1 | 0.1% |
| 83 | 1 | 0.1% |
| 84 | 3 |
| Value | Count | Frequency (%) |
| 202 | 1 | 0.1% |
| 195 | 1 | 0.1% |
| 194 | 1 | 0.1% |
| 192 | 1 | 0.1% |
| 190 | 2 | |
| 188 | 1 | 0.1% |
| 187 | 1 | 0.1% |
| 186 | 2 | |
| 185 | 4 | |
| 184 | 4 |
exang
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.9 KiB |
| 0.0 | |
|---|---|
| 1.0 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 2220 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
| 0.0 | 444 | 60.0% |
| 1.0 | 296 | 40.0% |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
| 0.0 | 444 | 60.0% |
| 1.0 | 296 | 40.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1184 | |
| . | 740 | |
| 1 | 296 | 13.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1480 | |
| Other Punctuation | 740 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1184 | |
| 1 | 296 | 20.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 740 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2220 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 1184 | |
| . | 740 | |
| 1 | 296 | 13.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2220 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 1184 | |
| . | 740 | |
| 1 | 296 | 13.3% |
oldpeak
Real number (ℝ)
| Distinct | 44 |
|---|---|
| Distinct (%) | 5.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.8943243243 |
| Minimum | -1 |
|---|---|
| Maximum | 6.2 |
| Zeros | 329 |
| Zeros (%) | 44.5% |
| Negative | 2 |
| Negative (%) | 0.3% |
| Memory size | 5.9 KiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0.5 |
| Q3 | 1.5 |
| 95-th percentile | 3 |
| Maximum | 6.2 |
| Range | 7.2 |
| Interquartile range (IQR) | 1.5 |
Descriptive statistics
| Standard deviation | 1.08715975 |
|---|---|
| Coefficient of variation (CV) | 1.215621359 |
| Kurtosis | 1.248792798 |
| Mean | 0.8943243243 |
| Median Absolute Deviation (MAD) | 0.5 |
| Skewness | 1.206874625 |
| Sum | 661.8 |
| Variance | 1.181916322 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=44)
| Value | Count | Frequency (%) |
| 0 | 329 | |
| 1 | 73 | 9.9% |
| 2 | 64 | 8.6% |
| 1.5 | 38 | 5.1% |
| 3 | 26 | 3.5% |
| 0.5 | 18 | 2.4% |
| 1.2 | 17 | 2.3% |
| 0.8 | 15 | 2.0% |
| 0.6 | 14 | 1.9% |
| 1.4 | 13 | 1.8% |
| Other values (34) | 133 |
| Value | Count | Frequency (%) |
| -1 | 1 | 0.1% |
| -0.5 | 1 | 0.1% |
| 0 | 329 | |
| 0.1 | 7 | 0.9% |
| 0.2 | 12 | 1.6% |
| 0.3 | 4 | 0.5% |
| 0.4 | 9 | 1.2% |
| 0.5 | 18 | 2.4% |
| 0.6 | 14 | 1.9% |
| 0.7 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 6.2 | 1 | 0.1% |
| 5.6 | 1 | 0.1% |
| 5 | 1 | 0.1% |
| 4.4 | 1 | 0.1% |
| 4.2 | 2 | 0.3% |
| 4 | 8 | |
| 3.8 | 1 | 0.1% |
| 3.6 | 4 | |
| 3.5 | 1 | 0.1% |
| 3.4 | 3 | 0.4% |
num
Categorical
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.9 KiB |
| 0 | |
|---|---|
| 1 | |
| 2 | |
| 3 | |
| 4 | 22 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 740 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 357 | 48.2% |
| 1 | 204 | 27.6% |
| 2 | 79 | 10.7% |
| 3 | 78 | 10.5% |
| 4 | 22 | 3.0% |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
| 0 | 357 | 48.2% |
| 1 | 204 | 27.6% |
| 2 | 79 | 10.7% |
| 3 | 78 | 10.5% |
| 4 | 22 | 3.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 357 | |
| 1 | 204 | |
| 2 | 79 | 10.7% |
| 3 | 78 | 10.5% |
| 4 | 22 | 3.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 740 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 357 | |
| 1 | 204 | |
| 2 | 79 | 10.7% |
| 3 | 78 | 10.5% |
| 4 | 22 | 3.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 740 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 357 | |
| 1 | 204 | |
| 2 | 79 | 10.7% |
| 3 | 78 | 10.5% |
| 4 | 22 | 3.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 740 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 357 | |
| 1 | 204 | |
| 2 | 79 | 10.7% |
| 3 | 78 | 10.5% |
| 4 | 22 | 3.0% |
dataset
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 5.9 KiB |
| cleveland | |
|---|---|
| hungarian | |
| va | |
| switzerland |
Length
| Max length | 11 |
|---|---|
| Median length | 9 |
| Mean length | 7.894594595 |
| Min length | 2 |
Characters and Unicode
| Total characters | 5842 |
|---|---|
| Distinct characters | 16 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | hungarian |
|---|---|
| 2nd row | hungarian |
| 3rd row | hungarian |
| 4th row | hungarian |
| 5th row | hungarian |
Common Values
| Value | Count | Frequency (%) |
| cleveland | 303 | 40.9% |
| hungarian | 261 | 35.3% |
| va | 130 | 17.6% |
| switzerland | 46 | 6.2% |
Length
Histogram of lengths of the category
Category Frequency Plot
| Value | Count | Frequency (%) |
| cleveland | 303 | 40.9% |
| hungarian | 261 | 35.3% |
| va | 130 | 17.6% |
| switzerland | 46 | 6.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 1001 | |
| n | 871 | |
| l | 652 | |
| e | 652 | |
| v | 433 | |
| d | 349 | 6.0% |
| r | 307 | 5.3% |
| i | 307 | 5.3% |
| c | 303 | 5.2% |
| h | 261 | 4.5% |
| Other values (6) | 706 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 5842 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 1001 | |
| n | 871 | |
| l | 652 | |
| e | 652 | |
| v | 433 | |
| d | 349 | 6.0% |
| r | 307 | 5.3% |
| i | 307 | 5.3% |
| c | 303 | 5.2% |
| h | 261 | 4.5% |
| Other values (6) | 706 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5842 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 1001 | |
| n | 871 | |
| l | 652 | |
| e | 652 | |
| v | 433 | |
| d | 349 | 6.0% |
| r | 307 | 5.3% |
| i | 307 | 5.3% |
| c | 303 | 5.2% |
| h | 261 | 4.5% |
| Other values (6) | 706 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 5842 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 1001 | |
| n | 871 | |
| l | 652 | |
| e | 652 | |
| v | 433 | |
| d | 349 | 6.0% |
| r | 307 | 5.3% |
| i | 307 | 5.3% |
| c | 303 | 5.2% |
| h | 261 | 4.5% |
| Other values (6) | 706 |
Spearman's ρ
Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables, so it is better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
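The rank-then-correlate recipe described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the report's implementation: the helper names `average_ranks`, `pearson` and `spearman` and the sample data are made up for the example, and tied values are assumed to receive the average of their rank positions.

```python
from math import sqrt
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of standard deviations.

    The 1/(n-1) factors cancel, so raw sums are used.
    Undefined (division by zero) if either input is constant.
    """
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: y = x**3 is monotonic in x but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman(x, y)
r = pearson(x, y)
print(f"rho={rho:.4f} r={r:.4f}")
```

On this made-up data the ranks of `x` and `y` coincide, so ρ reaches its maximum while r stays below it, which is the difference between monotonic and linear correlation.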
Pearson's r
Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
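As a minimal sketch of the calculation described above, the following computes r as covariance over the product of standard deviations and checks the location and scale invariance. The function name `pearson_r` is an assumption for illustration; the sample values are the age and thalach columns of the ten rows listed under "First rows" in this report, used here only as convenient input.

```python
from math import sqrt


def pearson_r(x, y):
    """Sample covariance of x and y divided by the product of their
    sample standard deviations (undefined for constant inputs)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (sd_x * sd_y)


# age and thalach from the ten rows shown under "First rows".
age = [28, 29, 30, 31, 32, 32, 32, 33, 34, 34]
thalach = [185, 160, 170, 150, 165, 184, 155, 185, 190, 168]

r_original = pearson_r(age, thalach)

# Shift and rescale each variable separately (positive scale factors).
age_shifted = [2 * a + 10 for a in age]
thalach_shifted = [0.5 * t - 3 for t in thalach]
r_rescaled = pearson_r(age_shifted, thalach_shifted)

print(r_original, r_rescaled)  # the two values agree up to rounding error
```

Ten rows are far too few to say anything about the full dataset; the point is only that the two printed values coincide.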
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
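The pair-counting definition above can be written out directly. The sketch below implements exactly that formula (often called tau-a); the function name `kendall_tau_a` and the sample data are assumptions for illustration. Library implementations commonly use tie-adjusted variants such as tau-b, so their results can differ from this sketch when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs.

    A pair (i, j) is concordant when x and y move in the same direction
    between the two observations, and discordant when they move in
    opposite directions. Tied pairs count toward neither.
    """
    pairs = list(combinations(range(len(x)), 2))
    concordant = 0
    discordant = 0
    for i, j in pairs:
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Hypothetical data: three pairs, two concordant and one discordant.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```

This direct version compares every pair, so its cost grows quadratically with the number of observations; that is fine for a dataset of this size.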
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. The Phik project provides extensive documentation.
A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
First rows
| | df_index | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | num | dataset |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 28.0 | 1.0 | 2.0 | 130.0 | 132.0 | 0.0 | 2.0 | 185.0 | 0.0 | 0.0 | 0 | hungarian |
| 1 | 1 | 29.0 | 1.0 | 2.0 | 120.0 | 243.0 | 0.0 | 0.0 | 160.0 | 0.0 | 0.0 | 0 | hungarian |
| 2 | 3 | 30.0 | 0.0 | 1.0 | 170.0 | 237.0 | 0.0 | 1.0 | 170.0 | 0.0 | 0.0 | 0 | hungarian |
| 3 | 4 | 31.0 | 0.0 | 2.0 | 100.0 | 219.0 | 0.0 | 1.0 | 150.0 | 0.0 | 0.0 | 0 | hungarian |
| 4 | 5 | 32.0 | 0.0 | 2.0 | 105.0 | 198.0 | 0.0 | 0.0 | 165.0 | 0.0 | 0.0 | 0 | hungarian |
| 5 | 6 | 32.0 | 1.0 | 2.0 | 110.0 | 225.0 | 0.0 | 0.0 | 184.0 | 0.0 | 0.0 | 0 | hungarian |
| 6 | 7 | 32.0 | 1.0 | 2.0 | 125.0 | 254.0 | 0.0 | 0.0 | 155.0 | 0.0 | 0.0 | 0 | hungarian |
| 7 | 8 | 33.0 | 1.0 | 3.0 | 120.0 | 298.0 | 0.0 | 0.0 | 185.0 | 0.0 | 0.0 | 0 | hungarian |
| 8 | 9 | 34.0 | 0.0 | 2.0 | 130.0 | 161.0 | 0.0 | 0.0 | 190.0 | 0.0 | 0.0 | 0 | hungarian |
| 9 | 10 | 34.0 | 1.0 | 2.0 | 150.0 | 214.0 | 0.0 | 1.0 | 168.0 | 0.0 | 0.0 | 0 | hungarian |
Last rows
| | df_index | age | sex | cp | trestbps | chol | fbs | restecg | thalach | exang | oldpeak | num | dataset |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 730 | 908 | 74.0 | 1.0 | 4.0 | 155.0 | 310.0 | 0.0 | 0.0 | 112.0 | 1.0 | 1.5 | 2 | va |
| 731 | 909 | 68.0 | 1.0 | 3.0 | 134.0 | 254.0 | 1.0 | 0.0 | 151.0 | 1.0 | 0.0 | 0 | va |
| 732 | 910 | 51.0 | 0.0 | 4.0 | 114.0 | 258.0 | 1.0 | 2.0 | 96.0 | 0.0 | 1.0 | 0 | va |
| 733 | 911 | 62.0 | 1.0 | 4.0 | 160.0 | 254.0 | 1.0 | 1.0 | 108.0 | 1.0 | 3.0 | 4 | va |
| 734 | 912 | 53.0 | 1.0 | 4.0 | 144.0 | 300.0 | 1.0 | 1.0 | 128.0 | 1.0 | 1.5 | 3 | va |
| 735 | 913 | 62.0 | 1.0 | 4.0 | 158.0 | 170.0 | 0.0 | 1.0 | 138.0 | 1.0 | 0.0 | 1 | va |
| 736 | 914 | 46.0 | 1.0 | 4.0 | 134.0 | 310.0 | 0.0 | 0.0 | 126.0 | 0.0 | 0.0 | 2 | va |
| 737 | 915 | 54.0 | 0.0 | 4.0 | 127.0 | 333.0 | 1.0 | 1.0 | 154.0 | 0.0 | 0.0 | 1 | va |
| 738 | 917 | 55.0 | 1.0 | 4.0 | 122.0 | 223.0 | 1.0 | 1.0 | 100.0 | 0.0 | 0.0 | 2 | va |
| 739 | 919 | 62.0 | 1.0 | 2.0 | 120.0 | 254.0 | 0.0 | 2.0 | 93.0 | 1.0 | 0.0 | 1 | va |